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I.  INTRODUCTION 


Missile  flight  and  performance  characteristics  are  Investigated  using 
several  different  types  of  testing,  including  actual  flight  tests,  laboratory 
tests  of  major  system  components,  and  computer  simulation  studies.  For  example, 
in  addition  to  flight  tests,  extensive  laboratory  tests  of  the  electronic  sub¬ 
systems  and  wind  tunnel  tests  of  the  airframe  are  conducted.  Computer  simula¬ 
tion  models  may  be  used  to  gain  further  information  about  missile  performance 
variability  at  a  particular  set  of  flight  conditions  (such  as  target  type, 
speed,  altitude,  and  maneuver;  environmental  faetora  including  wlndspeed,  tem¬ 
perature,  and  visibility,  eto.),  or  they  may  be  used  to  explore  sets  of  con¬ 
ditions  where  no  flight  tests  have  been  performed.  Various  types  of  computer 
simulation  models  have  been  used  in  analyzing  missile  systems,  including  pure 
digital  simulators,  hybrid  digital-analog  simulators,  and  hardware-in-the-loop 
simulators . 

Model  validation  is  an  Important  aspect  of  using  computer  simulation  as 
part  of  the  test  program  for  a  missile  system.  By  validation  we  mean  an 
investigation  of  the  consistency  of  the  simulation  model  with  the  real  missile 
system.  Successful  validation  provides  a  basis  for  oonfidenoe  in  the  model's 
results,  and  is  a  necessary  step  if  the  model  is  to  be  used  to  draw  inferences 
about  the  behavior  of  the  real  missile.  A  reasonable  definition  of  validity  is 
that  a  set  of  input  conditions  to  the  model  should  produce  output  similar  to 
that  produced  by  the  real  missile  system  when  it  was  exposed  to  the  same  input. 
Consequently,  methods  for  comparing  computer  simulation  model  output  to  data 
generated  during  actual  flight  tests  are  typically  used  for  model  validation. 

This  paper  is  a  review  of  methods  useful  for  validation  of  computer  simu¬ 
lation  models  of  missile  systems.  Most  of  these  methods  discussed  are 
statistically-based.  For  other  more  general  discussions  of  the  validation 
question,  see  Naylor  and  Finger  (17),  Van  Horn  (23)(24),  and  Kheir  and  Holmes 
(12). 

II.  VALIDATION  METHODOLOGY 

Missile  performance  data  may  be  classified  as  either  statlo  or  dynamic. 
Examples  of  static  performance  data  are  kill  probabilities  and  terminal  miss 
distances.  Dynamic  performance  characteristics  are  output  phenomena  that  vary 
continuously  during  missile  flight,  such  as  roll  position,  roll  rate,  wing 
deflection,  system  gain  and  phase,  and  various  guidance  system  parameters. 

These  characteristics  are  usually  expressed  as  time  series.  We  now  briefly 
describe  acme  of  the  more  important  validation  techniques,  and  provide  a  list  of 
references  that  discuss  the  procedures  in  more  detail. 

Static  Data  Analysis  Methods  -  Many  published  discussions  of  simulation 
model  validation  foous  on  static  output  analysis.  Standard  statistical  proce¬ 
dures,  such  as  hypothesis  testing  methods,  oonfidenoe  intervals,  and  regression 
analysis,  can  be  used  in  this  context.  There  are  also  a  number  of  statistical 
techniques  developed  especially  for  use  in  the  simulation  environment. 

The  specific  statistical  methodology  used  depends  on  the  type  of  simula¬ 
tion  model.  For  example,  if  Monte  Carlo  simulation  is  used,  then  by  the  prooess 
of  replication  using  different  random  number  seeds  different  realizations  of  the 


output  variable,  say  Xj ,X£, . *»,xn  can  be  obtained.  These  observations  may  be 
viewed  as  a  random  sample  from  some  £opulation  f(x)  with  mean  y  and  variance  a2. 
The  sample  mean  and  sample  variance  x  and  s2  are  unbiased  estimators  of  y  and  - 
a2,  and  if  the  distribution  f(x)  is  not  too  different  from  the  normal  distribu¬ 
tion,  then  relatively  standard  statistical  methods  may  be  used  to  draw  inferen¬ 
ces  about  these  parameters.  Thus  if  x  is  terminal  miss  distance,  then  an 
approximate  100  (1-a)  percent  confidence  interval  on  mean  terminal  miss  distance 
is 


x  - 


ta/2,n-ls/v^  i  V  i  5  +  ta/2,n-ls/’^ 


where  ta/2,n-l  t*1®  uPP«r  a/2  percentage  point  of  the  t  distribution  with  n-1 
degrees  of  freedom.  Comparison  of  this  confidence  interval  with  design  specifi¬ 
cations  or  with  values  observed  in  flight  test  may  prove  helpful  in  assessing 
the  validity  of  the  model.  For  an  introduction  to  statistical  methods  in  data 
analysis,  see  Hines  and  Montgomery  (8),  Montgomery  (13),  and  Draper  and  Smith 
(4).  An  example  of  the  use  of  statistical  methods  in  simulation  model  valida¬ 
tion  is  in  Naylor,  Wallace  and  Sasser  (18). 

When  the  simulator  is  nonstoohastic  or  when  replication  is  prohibitively 
expensive,  statistical  methods  can  still  be  helpful  in  parameter  estimation. 

For  example,  if  the  simulator  produces  a  sequence  of  time-oriented  observations 
on  the  variable  of  interest,  then  one  may  obtain  reasonably  good  estimates  of 
the  mean  and  variance  and  confidence  intervals  either  by  approximating  the  out¬ 
put  as  an  autoregressive  process  and  obtaining  estimates  of  the  parameters  using 
time  series  methods  or  by  breaking  the  output  stream  into  k  batches  and  treating 
the  mean  of  each  batch  as  a  single  observation.  This  latter  approach  is  often 
called  the  method  of  batch  means.  For  further  details,  see  Fishman  (5). 

Goodness  of  Fit  Testing  -  This  approaoh  to  model  validation  involves  testing  the 
hypothesis  that  the  entire  sample  of  data  generated  by  a  computer  simulation 
model  has  the  same  probability  distribution  as  the  sample  of  data  observed  in 
the  flight  test.  Thus  our  attention  is  now  focused  on  the  conformance  of  the 
entire  distribution  of  sample  data  from  the  simulation  with  the  flight,  and  not 
just  on  the  parameters  of  these  distributions.  The  two-sample  Kolomogorov- 
Smlraov  test  is  a  distribution-free  test  that  is  highly  useful  in  this  regard. 

Por  further  reading  on  goodness  of  fit  testing,  see  Conover  (3). 

Goodness  of  fit  testing  may  often  be  viewed  as  an  improvement  over  static 
data  analysis  methods  for  validation.  It  is  entirely  possible  that  two  distribu¬ 
tions  have  Identical  sample  momenta  (or  means  and  variances)  but  differ  con¬ 
siderably  in  shape.  Goodness  of  fit  testing  is  designed  to  help  detect  such  a 
situation.  Its  only  weakness  is  that  most  goodness  of  fit  tests  require  inde¬ 
pendent  observations  (random  samples)  and  many  computer  simulations  produce  out¬ 
put  streams  that  are  highly  autooorrelated.  Consequently,  this  autocorrelation 
may  render  the  goodness  of  fit  testing  approaoh  less  useful. 
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Tine  Series  Analysis  Method  -  The  output  variables  of  Interest  in  many  studies 
of  missile  systems  is  represented  by  a  time  series.  Let  xt  be  the  time  series 
of  interest  observed  in  the  actual  flight  test  and  yt  be  the  corresponding  time 
series  generated  by  the  computer  simulation  model,  tsl,2,...,T.  These  time 
series  are  usually  highly  autocorrelated ,  and  may  exhibit  other  internal  struc¬ 
ture  (such  as  non-stationarity,  the  presence  of  deterministic  components,  etc.). 
To  validate  the  computer  simulation  model,  we  must  test  the  hypothesis  that  the 
two  time  series  xt  and  yt  ar®  equivalent. 

A  variety  of  methods  can  be  used  to  compare  the  time  series  x^  and  yt*  In 
the  validation  of  missile  systems,  nonstatistlcal  methods  are  sometimes  used. 

The  most  common  method  of  nonstatistlcal  comparison  involves  plotting  the  time 
series  xt  and  yt,  overlaying  the  plots,  and  sliding  them  along  until  as  close  a 
match  as  possible  is  obtained.  Then  the  analyst  determines  subjectively  whether 
or  not  the  output  time  series  from  the  simulator  agrees  with  the  flight  test 
results.  A  major  difficulty  with  this  approach  is  that  it  does  not  quantify  the  ' 
risk  associated  with  any  decision,  and  it  is  entirely  possible  that  different 
analysts  will  arrive  at  different  conclusions. 

Another  nonstatistlcal  procedure  sometimes  used  in  validating  computer 
simulation  models  is  Theil's  inequality  coefficient  (19) (20) (21).  This  coef¬ 
ficient  is  an  index  which  measures  the  conformance  of  one  time  series  with 
another.  Theil's  inequality  coefficient  has  been  extensively  used  to  validate 
computer  simulation  models  of  missile  systems  (for  example,  see  Kheir  and  Holmes 
(12)).  While  this  procedure  is  more  quantitative  than  simple  visual  comparison 
of  time  series,  there  is  no  standard  distribution  theory  for  Theil's  inequality 
coefficient,  and  so  no  statistical  statements  relative  to  the  conformance  of  the 
two  time  series  can  be  made. 

Several  statistical  methods  may  be  useful  in  comparing  time  series.  One 
approach  is  to  fit  an  appropriate  stochastic  model  to  xt  and  yt,  usually  an 
autoregressive  integrated  moving  average  model  (see  Box  and  Jenkins),  (2),  and 
then  compare  the  two  models.  If  the  two  models  are  the  same,  the  inference  is 
that  the  two  time  series  are  the  same.  A  test  for  the  equivalence  of  two  time 
series  models  is  described  by  Hsu  and  Hunter  (10),  who  also  illustrate  the  use 
of  the  procedure  in  validating  a  computer  simulation  model  of  an  airport. 
Unfortunately,  the  two  time  series  could  have  been  generated  by  the  same 
underlying  stochastic  model  and  still  differ  significantly  in  certain  charac¬ 
teristics,  particularly  over  the  relatively  short  records  typically  associated 
with  time  series  obtained  from  missile  systems.  For  example,  the  two  series 
oould  be  significantly  out  of  phase,  and  yet  both  oould  have  been  generated  from 
the  same  AR(2)  model  (say).  Furthermore,  differences  in  phase  angle,  gain  and 
frequency  usually  have  speoific  interpretations  to  the  missile  designer. 
Therefore,  he  would  like  to  know  if  such  differences  are  present. 

Spectral  methods  have  been  suggested  by  many  authors  for  validation  of 
oomputer  simulation  models  (6)(14)(18).  The  general  approaeh  oonsists  of  com¬ 
paring  the  sample'  spectra  of  the  simulation  model  output  and  the  corresponding 
flight  test  data  to  infer  how  well  the  simulation  matches  the  flight. 

The  spectrum  of  a  time  series  xt,  say  •  I®  *  decomposition  of  the 

total  variance  of  the  series  by  frequency  over  the  interval  0  <.  cj  <,  it,  Thus 
♦jojtw)  measures  the  variance  contribution  to  xt  at  frequency  ui.  The  spectrum  is 

« 
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related  to  the  autocovariance  function  of  a  wide-sense  stationary  time  series  by 
the  relationship 


4>  (to) 
xx 


Y0  +  2  l  Yk  cos(uk) 

k“l 


(1) 


where  {Yk},k=0,l,2, . . .  is  the  autocovariance  function  of  the  time  series  Xf 
Thus  the  spectrum  is  the  Fourier  cosine  transform  of  the  autocovariance  func¬ 
tion.  The  spectrum  is  estimated  by  the  sample  spectrum 


fxx(V  "  Vo  +  2  Vk  cos(wjk)  (2) 

where  fXx(oij)  is  an  estimate  of  the  spectrum  averaged  over  a  band  of  frequencies 
centered  at  ojj  =  iTj/m,  j=09l9...,m9  m  is  the  number  of  frequency  bands  esti¬ 
mated,  Xk>  ts09l9*ee9m  are  a  set  of  constants  or  weights,  and 

1  T-k 

CkmT-k  l  (*t-*)(* t+k-x),  k-1,2 . ra 

t»l 

is  the  sample  autocovariance  function.  The  weights  {x  depend  on  the  type  of 
spectral  window  used  in  the  estimation  process  (see  Fuller  (7)  and  Jenkins  and 
Watts  (ID).  Spectral  windows  are  employed  to  give  a  smoother  estimate  of  the 
spectrum. 


Let  f3ac(ojj)  denote  the  sample  spectrum  of  the  flight  test  data  and  fyy(uij) 
denote  the  sample  spectrum  of  the  simulation  output.  To  compare  these  spectra 
at  a  specific  frequency  construct  a  100(1  -  a)  percent  confidence  interval  on 
the  ratio  of  the  true  spectra,  say  $xx(<»j),  using 


Fa/2,k,k 


±**x(V/ VV  - 


F 


l-o/2,k,k 


(3) 


J-0,1, 2,...  ,m 

where  is  the  pth  percentage  point  of  the  F  distribution  with  k  =  2T/m 

degrees  of  freedom  in  the  numerator  and  denominator.  This  succession  of  con¬ 
fidence  intervals  at  the  frequency  points  uj,  js0,l,...,m  is  called  a  confidence 
band.  If  the  upper  and  lower  confidence  limits  contain  the  value  ♦i(wj)/$2(wj) 

=  1,  then  we  conclude  that  at  that  frequency  the  two  time  series  are  identical. 

For  the  time  series  to  be  identical,  their  spectra  must  be  equal  at  all 
frequencies  (oj»  Js0,l,...,m.  The  simultaneous  oonfldenoe  band  allows  us  to 
state  with  a  probability  at  least  1-ct  that  all  m+l  confidence  intervals  are 
simultaneously  true.  The  100(l-a)  percent  simultaneous  confidence  band  is  com¬ 
puted  from 
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(4) 


For  example,  if  we  wished  to  make  the  statement  that  all  m+1  frequencies  are 
simultaneously  equal  with  a  probability  of  at  most  0.05  (that  is,  a  95  percent 
simultaneous  confidence  interval),  then  the  probability  level  associated  with 
the  f-value  at  each  frequency  is  0.05/2 (m+1)  =  0.025/(m+l).  Thus  if  there  are 
16  bands  in  the  sample  spectrum,  then 

0.025  =  0.00156 
16 

probability  to  each  tail  of  the  F  distribution.  Note  that  the  simultaneous  con¬ 
fidence  band  (4)  is  wider  than  the  single  confidence  interval  (3)  at  any  fre¬ 
quency.  For  an  application  of  this  methodology  to  the  validation  of  a  computer 
simulation  model  of  a  missile  system,  see  Montgomery  and  Conard  (14). 

In  constructing  confidence  intervals ,  the  analyst  must  specify  a ,  the 
Type  I  error  rate.  In  other  words,  a  is  the  probability  that  we  will  conclude 
that  the  simulation  and  flight- test  data  differ  when  they  really  do  not.  A 
reasonable  range  of  values  is  0.01  <  a  <  0.10.  Values  of  a  >  0.1  imply  that 

it  is  relatively  easy  to  find  that  the  flight  test  and  the  simulation  differ 
when  they  really  do  not,  while  values  of  <*  <  0.01  imply  that  it  is  easy  to 
conclude  that  the  flight  test  and  the  simulation  match  when  they  really  do  not. 
Note  that  the  confidence  intervals  (3)  and  (4)  became  wider  as  a  becomes 
smaller.  Thus  very  small  values  of  a  make  it  easier  to  conclude  that  the  two 
time  series  agree. 

Spectral  methods  can  be  applied  only  to  a  stationary  series.  If  the  series 
is  nonstationary,  then  the  nonstationary  part  of  the  prooess  must  be  removed, 
either  by  successive  differencing  or  by  fitting  a  polynomial  model  (or  other 
appropriate  function)  to  the  data  and  analyzing  the  residuals.  Piecewise  poly¬ 
nomial  fitting  may  be  necessary  when  the  time  series  exhibits  different  behavior 
in  different  local  segments  of  time.  This  may  be  conveniently  done  using 
splines.  Several  useful  references  are  in  Draper  and  Smith  (4).  Indications  of 
nonstationarity  are  usually  observed  in  either  the  sample  autocorrelation  func¬ 
tion  or  the  sample  spectrum.  If  the  autocorrelation  funotion  does  not  die  down 
even  at  very  long  lags  or  if  the  power  is  oonoentrated  at  the  lowest  frequency 
in  the  spectrum,  then  the  series  is  probably  nonstationary.  Note  that  if  two 
nonstationary  series  x^  and  yt  are  compared ,  they  are  equivalent  in  a  frequency 
sense  if  both  their  stationary  representations  have  the  same  spectrum  and  if  the 
same  level  of  differencing  (or  the  same  order  polynomial)  is  required  to  reduce 
both  of  them  to  stationarlty. 

Internal  Validity  Checking  -  While  simulation-to-fllght  test  comparisons  form 
the  basis  of  simulation  model  validation,  it  is  also  usually  necessary  to  vali¬ 
date  the  internal  logio  of  the  simulation  model.  Spectral  methods  can  be  useful 
in  this  aspect  of  validation  also.  These  techniques  can  be  used  to  investigate 
the  interrelationships  between  two  time  series  generated  by  the  simulator.  For 
example,  suppose  that  the  ocmputer  simulation  model  produoes  time  series  output 


of  fin  deflection  and  airframe  lateral  acceleration.  The  lateral  accelerations 
are  a  physical  result  of  the  fin  deflections  and  the  nonlinear  aerodynamic 
response  of  the  airframe.  Causal  or  correlative  structure  between  these  two 
time  series  should  be  reflected  in  the  analysis. 

Interrelationships  between  two  time  series  x^  and  yt  are  based  on  the 
cross-spectrum.  The  cross-covariance  function  is 


•Y_00 

xy 


E(xt-V(ywfc-  V 


(5) 


where  x  and  y  are  the  means  of  xt  and  y^  respectively.  Unlike  the  autocorre¬ 
lation  function  for  a  single  time  series,  the  cross-eorrelatlon  function  xy(k) 
may  not  be  symmetric  about  zero.  The  oross-spectrum  is  defined  as 


Y  (k)e 
'xy 


-iUJk 


(6) 


Rote  that  xy(  )  is  a  continuous  periodic  function  of  (the  frequency).  Since 
xy(k)  may  not  be  symmetric  about  zero,  the  cross-spectrum  is  in  general  a 
complex  function,  say 


”  CjjytW)  -  i  q^Od), 


(7) 


where  Cjty(  )  is  the  coincident  spectral  density  (cospectrum)  and  qXy(  )  is  the 
quadrature  speotral  density.  Both  (W  }  and  qxy(  )  are  real-valued  functions 
of  .  Cgy(  )  is  the  cosine  portion  of  the  transform  and  is  an  even  function  of 
,  while  )  is  the  sine  portion  of  the  transform  and  is  an  odd  function  of 
.  Let  fgy(  )  denote  an  estimate  of  the  cross-spectrum  and,  let  C^yt  )  and 
q*y(  )  denote  the  estimates  of  Cgyt  )  and  qxy(  )  respectively.  For  an  Introduc¬ 
tion  to  the  estimation  problem  see  Fuller  (7)  and  Jenkins  and  Watts  (11). 

The  squared  ooherenov  is  defined  as 


2 

xy 


<«) 


(8) 


l^xy(“) |2  "  C^y(W> 


where 


and  <f)3cx(a))  and  ♦yy(a))  are  the  spectra  of  xt  and  yt,  respectively.  The  coherency 
is  analogous  to  the  coefficient  of  multiple  determination  R2  in  multiple 
regression.  Thus  coherency  is  a  measure  of  independence  of  x^  and  y^  at  fre¬ 
quency  a).  If  coherency  is  0,  then  the  series  are  independent  (unrelated),  while 
if  coherency  is  1,  then  the  series  are  perfectly  dependent  (related).  Coherency 
is  a  nondimensional  measure  of  the  correlation  between  two  time  series  as  a 
function  of  frequency.  Another  way  to  think  of  coherency  is  in  terms  of  the 
predictability  of  one  series  from  the  other.  If  coherency  is  0,  then  one  series 
cannot  be  predicted  from  the  other,  while  if  coherency  is  1,  one  series  can  be 
perfectly  predicted  from  the  other.  We  may  also  think  of  coherency  as  the  pro¬ 
portion  of  the  total  power  (by  frequency)  in  one  time  series  that  can  be 
explained  by  the  other  time  series. 


An  F-test  for  zero  coherency  is  given  by 

Ad  K2  (u>) 

F.  - - sy_ 

2[1  -  K^CuO] 


(9) 


which  if  K2xy(u) )  s  0  is  distributed  as  F2  f  4<j ,  where  d  is  the  number  of  points  at 
which  the  spectrum  is  estimated,  and  * 


K2  (w) 
xy 


(10) 


is  an  estimate  of  coherency.  If  Fg  >  Fa  2,4d»  the  hypothesis  of  zero  coherency 
is  rejected.  While  the  limiting  values  or  coherency,  0  and  1,  are  of  obvious 
interest,  intermediate  values  are  also  of  interest  because  of  the  natural 
Interpretation  of  coherency  as  a  "percent  variability  explained".  If  the 
coherency  function  is  greater  than  zero  but  less  than  unity,  one  or  more  of 
three  possibilities  exist 

1.  Extraneous  noise  is  present  in  the  measurements, 

2.  The  system  relating  xt  and  yt  is  not  linear,  or 

3.  yt  is  an  output  related  to  the  input  xt  as  well  as  to  other  inputs. 

The  gain  of  yt  over  xt  is  defined  as 


A  («)  . 

G  (&>)  -  -  '^1  r  -  - 5 X. 


xy 


ici»  +  y(“))1/2 


♦  («) 


(11) 


* 
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The  gain  behaves  like  a  regression  coefficient  in  the  regression  of  y  on  x 
through  the  origin,  but  it  is  now  evaluated  at  frequency  w .  That  is,  gain 
measures  the  increase  in  amplitude  of  y^  over  that  of  x^  at  frequency w.  One 

may  construct  a  lOO(l-a)  percent  confidence  interval  on  gain,  using 


max{0,  G  (u>)  -  A}  <  G  (w)  <  G  (w)  +  A 
xy  —  xy  —  xy 


where 


4  •  km  ♦  i)\wy.iF„A/! 


g  w  .  ^ 

*»  f»<") 


(12) 


and 


fZZ(U))  =  fyy(aj)tl  ”  gxy<Ul)1  ^ 


The  phase  spectrum  is  defined  as 


(13) 


”  tan_l 


(01)1 


and  estimated  by 


♦^(u)  -  tan'1  l-q^O^/C^w)]. 


(1U) 


The  phase  spectrum  shows  whether  the  frequency  components  in  one  series  lead  or 
lag  the  components  at  the  same  frequency  in  the  other  series.  If  the  coherency 
is  zero  at  frequency  co ,  that  implies  that  a  100 (1-cO  percent  confidence  interval 
on  the  phase  angle  is  (-tt/2 ,  ir/2).  That  is,  the  average  phase  difference  bet¬ 

ween  the  two  processes  is  zero,  but  the  phase  difference  is  equally  likely  to 
lie  anywhere  in  the  range  (-ir/2,  ir/2).  If  coherency  is  not  zero,  then  a 
100 (l-o)  percent  confidence  interval  is 

♦jtyfa)  “  5  £  4^  0*0  —  ‘f’xy^10)  +  (15) 
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where 


6  =  sin 


-1 


a,4d 


aO 

1  -  JT  (to) 
_ _2E _ 

4d  K2  (to) 
xy 


1/2 


Generally  speaking,  the  cross  correlation  structure  between  two  time 
series  can  be  adequately  described  by  their  squared  coherency  and  phase  spectra. 
Therefore,  it  is  recommended  that  internal  validity  checking  concentrate  on 
these  measures. 

Screening  and  Preparation  of  Data  -  A  potentially  frustrating  problem  for  the 
data  analyst  is  dealing  with  wild  or  unusual  observations  in  either  the  observed 
flight  data  or  the  simulation  data.  These  wild  or  aberrant  observations  may 
severely  distort  the  sample  spectrum  or  estimates  of  the  parameters  of  the 
underlying  distributions.  Often  we  find  that  some  type  of  data  editing  or  pre¬ 
liminary  screening  of  the  data  is  necessary. 

Two  approaches  are  useful  in  this  regard.  The  first  of  these  is  to  smooth 
the  data  with  a  nonlinear  robust  filter  to  eliminate  spikey  noise.  The  more 
popular  nonlinear  smoothers  are  usually  based  on  running  medians  (see  Tukey  (22) 
and  Velleman  (25).  A  second  approach  is  to  fit  a  model  to  the  data  that  descri¬ 
bes  the  smooth  portion  of  the  signal.  Fitting  methods  more  robust  than  least 
squares  are  recommended  in  this  approach.  It  is  well-known  that  least  squares  is 
severely  distorted  by  outliers.  Robust  fitting  procedures  are  discussed  by 
Hogg  (9)*  Agee  and  Turner  (1)  describe  the  use  of  these  methods  in  prepro¬ 
cessing  of  missile  trajectory  data. 

Examples  -  The  validation  and  analysis  methodology  presented  in  the  previous 
sections  has  been  implemented  with  a  FORTRAN  IV  computer  program.  This  program 
and  its  operation  are  described  in  more  detail  in  the  Appendix.  In  this  section 
we  will  illustrate  the  use  of  the  methodology  with  several  examples. 

Examples  Pslng  Simulated  Data  -  In  order  to  check  program  operation  and  investi¬ 
gate  the  effectiveness  of  the  proposed  methods,  sample  time  series  were 
generated  from  stochastic  processes  having  known  spectral  properties.  The  abi¬ 
lity  of  the  procedure  to  produce  sample  spectral  estimates  matching  the  popula¬ 
tion  values  reasonably  closely  is  then  investigated. 

A  plot  of  two  realizations  of  length  200  from  the  AR(2)  process 


xt-l  "  0,5xt-2  +  at 


at  NID(0,Oa) 
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is  shown  in  Figure  1.  The  theoretical  autocorrelation  function  and 
spectrum  are  shown  in  Figure  2.  The  sample  autocorrelation  functions  for  both 
realizations  are  shown  in  Figures  3  and  4.  Both  plots  exhibit  the  charac¬ 
teristic  pseudocyclic  decay  associated  with  the  AR(2)  process.  Figures  5  and  6 
present  the  smoothed  spectral  estimates  for  truncation  points  of  m  •  8,  16,  and 
32.  At  m  -  8,  the  low-frequency  power  in  the  series  is  easily  seen,  but  there 
is  no  indication  of  the  peak.  Increasing  m  to  16  gives  some  indication  of  the 
peak  for  both  realizations.  At  m  *  32,  other  peaks  occur  because  of  the 
increase  in  variance  of  the  estimate.  While  it  is  difficult  to  obtain  a  good 
estimate  of  a  narrow  two-sided  peak  with  a  realization  of  length  200,  we 
conclude  that  m  =  16  or  perhaps  m  =  24  would  be  adequate  to  show  the  major 
features  of  the  spectrum. 

A  comparison  of  the  two  spectra  using  equations  (3)  and  (4)  with  a  trun¬ 
cation  point  m  =  24  is  shown  in  Table  1.  Individual  95  percent  confidence 
limits  on  the  ratio  of  the  spectra  are  shown  in  columns  (e)  and  (f),  while  the 
corresponding  simultaneous  confidence  limits  are  shown  in  columns  (g)  and  (h). 
The  simultaneous  confidence  limits  indicate  agreement  between  the  spectra  at  21 
of  the  25  tabulated  points,  and  only  one  of  the  four  missmatched  points 
corresponds  to  a  frequency  in  the  spectrum  having  significant  power. 

Considering  the  difficulty  in  estimating  a  spectrum  with  a  narrow  two-sided 
peak,  we  conclude  that  there  is  no  strong  evidence  that  the  two  series  are  not 
realizations  of  the  same  underlying  stochastic  process. 

The  second  simulated  example  consists  of  100  realizations  of  the  bivariate 
linear  process 

xt  -  +  at 


yt  «  0.4*^  +  O.Sy^  +  bfc 


at  ^  NID(0,1),  bt  ^  NID(0,1) 


described  in  Jenkins  and  Watts  (11).  Both  series  are  plotted  in  Figure  7*  The 
sample  autocorrelation  functions  and  spectra  (with  truncation  points  m  s  8,  16 
and  32)  for  x^  and  y^  are  shown  in  Figures  8,  9,  10,  and  11,  respectively.  Note 
that  although  the  bivariate  process  is  first  order,  the  autocorrelation  function 
and  spectrum  resemble  those  of  a  second-order  univariate  process.  Furthermore, 
both  series  have  the  same  general  behavior,  with  a  peak  or  trough  in  x^  followed 
by  a  corresponding  peak  or  trough  in  y^  after  one  or  two  observations.  This  is 
confirmed  from  inspection  of  the  sample  cross  correlation  function  shown  in 
Figure  12,  which  shows  larger  spikes  at  positive  lags  1  and  2. 

The  comparison  of  the  univariate  spectra  is  shown  in  Table  2,  based  on  the  trun¬ 
cation  point  m  s  16,  and  with  a  significant  level  of  95  percent.  Inspection  of 
the  simultaneous  confidence  limits  in  columns  (g)  and  (h)  reveal  that  the  two 
series  closely  agree. 
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Plot  of  autocorrelations 


XT 


> 


Spectrum  of  series  1  and  2 


Figure  10.  Sample  Spectrum  for  xtvlthm*  8,  16,  and  32. 
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(b) 

X  SPECTRUM 
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.272040 
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.089630 
.088248 
.087765 
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_ (c) 

Y  SPECTRUM 
1.644083 
1.713186 
1.475656 
1.453755 
1.603301 
1.762093 
1.543782 
1.718659 
1.173977 
.660522 
.415672 
.196577 
.155779 
.128775 
.090467 
.056107 
.079530 
.100406 
.056462 
.046995 
.049912 
.059653 
.064972 
.078205 
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(d) 

RATIO 
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.774454 

.768072 

.576575 

.786292 
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.459425 

.958941 
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TABLE  1 

Comparison  of  Si 

_ (e) 

IHDIV  Cl  LB 
.553752 
.517856 
.513588 
.385539 
.525772 
.501913 
.385239 
.307204 
.641217 
.910530 
1.056416 
1.360821 
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1.412583 
1.425774 
.914135 
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.593255 
1.061478 
1.255660 
1.175797 
1.023869 
.603221 
.486617 
.694617 


>ectra 

(f) 

IHDIV  Cl  UB 
1.236480 
1.158197 
1.148652 
.862268 
1.175901 
1.122541 
.861597 
.687070 
1.434097 
2.036423 
2.362699 
3.043508 
2.270131 
3.159275 
3.188778 
2.044486 
1.264078 
1.326830 
2.374021 
2.808313 
2.629699 
2.289909 
1.349118 
1.088330 
1.553528 


(g) 

SIMP  Cl  LB 
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.304722 
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(h) 

SIMU  Cl  UB 
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1.453293 
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2.872206 
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4.034494 
2.586717 
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1.201867 

1.270897 
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.841212 
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.591308 

1.518624 

.474889 

1.704283 

.287546 

1.543914 

.182383 

1.535845 

.138035 
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.181256 
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TABLE  2 


Comparison  of  Spectra 


(e) 

(f) 

_  (g) 

(h) 

INDIV  Cl  LB 

INDIV  Cl  UB 

SIMU  Cl  LB 

SIMU  Cl  UB 

.480849 
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.511936 
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3.231109 
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3.626128 

.941706 

2.531224 
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3.284916 

.936785 
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.721848 

3.267748 

.791355 

2.127094 
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2.760453 

.484672 

1.302756 

.373469 

1.690663 

.600978 

1.615376 

.463089 
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1.983018 
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2.573478 

.542779 

1.458942 
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1.893354 

.511135  1.373884  .393860  1.782970 

.620245  1.667163  .477935  2.163574 


The  cross  correlative  nature  of  the  two  series  can  be  investigated  by 
decomposing  the  cross  spectrum  into  squared  coherency  and  phase  components.  The 
theoretical  coherency  and  phase  spectra  for  the  bivariate  linear  process  are 
shown  in  Figure  13-  Figures  14  and  15  show  the  corresponding  sample  spectra. 

The  sample  coherency  spectrum  in  Figure  14,  based  on  a  truncation  point  of  m  = 
16,  shows  reasonably  stable  behavior  and  agrees  closely  with  the  theoretical 
spectrum.  The  most  important  features  of  the  coherency  spectrum  is  the  single 
large  peak  and  the  tendency  toward  zero  at  both  low  and  high  frequencies.  Thus, 
the  majority  of  the  correlation  between  the  two  series  is  centered  in  the  mid¬ 
range  frequencies.  The  sample  phase  spectrum,  Figure  15,  shows  good  agreement 
with  the  theoretical  phase  spectrum.  The  phase  spectrum  shows  that  low  fre¬ 
quency  components  of  xt  lag  those  of  yt  by  approximately  90  degrees  but  that 
the  phase  difference  tends  to  zero  at  higher  frequencies. 

An  Example  Using  Real  Data  -  We  will  now  illustrate  simulation  model  validation 
using  time  series  data  from  a  control  variable  for  a  typical  missile.  Figure  16 
presents  800  realizations  of  simulation  data  (dashed  line)  and  flight  test  data 
(solid  line).  The  visual  impression  is  that  both  series  agree  closely.  The 
autocorrelation  functions  and  spectra,  shown  in  Figures  17,  18,  19,  and  20  are 
typical  of  those  associated  with  AR(2)  processes.  Each  spectra  is  computed 
using  truncation  points  of  m  =  8,  16,  and  32.  There  is  evidence  of  a  narrow 
peak  in  the  low  frequency  range. 

Table  3  presents  the  statistical  comparison  of  the  spectra  for  a  trun¬ 
cation  point  of  m  =  32.  The  95  percent  simultaneous  confidence  limits,  shown  in 
columns  (g)  and  (h),  indicate  that  at  all  points  in  the  spectrum  containing 
significant  power  there  is  good  agreement  between  the  simulation  and  flight  test 
data.  The  only  simultaneous  confidence  limits  that  do  not  include  one  are  at 
higher  frequencies  where  the  power  is  low.  We  conclude  that  the  simulation 
model  and  the  flight  test  data  are  substantially  equivalent. 


Plot  of  autocorrelations 
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Figure  18.  Saaple  ACT,  Flight  Test  Data 


Spectrum  of  aeries  1  and  2 
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FREQUENCY  (RADIANS) 

Figure  20.  Smoothed  Spectral  Estimates,  Flight  Test  Data. 
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_ (c) 

Y  SPECTRUM 
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10.983929 
14.166088 
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.019926 
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RATIO 
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3.743357 
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25.596934 
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arisen  of  Spectra 


_ (•*._ 

IHDIV  Cl  LB 

.717275 
.712091' 
.713544 
.722894 
.734860 
.749162 
.771602 
.802611 
.828666 
.859705 
.903812 
.958451 
1.017992 
1.094312 
1.172987 
1.303041 
1.425102 
1.602123 
1.693215 
1.938886 
2.094104 
2.643117 
2.841948 
3.573057 
3.723345 
6.394890*  ] 

6.577006  ] 

16.275253  ; 

10.941420  2 

38.976502  1 

16.029325 
16.029325 
18.073535  2 


ill 

i  IHDIV  Cl 
1.438717 
1.428319 
1.431233 
1.449988 
1.473989 
1.502676 
1.547687 
1.609885 
1.662145 
1.724404 
1.812875 
1.922471 
2.041898 
2.194982 
2.352788 
2.613653 
2.858482 
3.213554 
3.396266 
3.889035 
4.200374 
5.301589 
5.700406 
7.166871 
7.468321 
12.826931 
13.192221 
32.645058 
21.946405 
78.179438 
32.151774 
32.151774 
36.252068 


B  SIMP  Cl 
.577091 
.572920 
.574088 
.581611 
.59123 
.602745 
.620800 
.645748 
.666711 
.691684 
.727171 
.771131 
.819035 
.880440 
.943738 
1.048374 
1.146579 
1.289004 
1.362292 
1.559949 
1.684831 
2.126545 
2.286517 
2.874737 
2.995653 
5.145070 
5.291593 
13.094411 
8.803025 
31.358918 
12.896548 
12.896548 
14.541236 


_ (h) 

■  SIMP  Cl  UB 
1.788205 
1.775281 
1.778902 
1.802213 
1.832045 
1.86770 
1.923645 
2.000952 
2.065907 
2.143290 
2.253252 
2.389470 
2.537908 
2.728179 
2.924318 
3.248551 
3.552853 
3.994178 
4.221274 
4.833745 
5.220712 
6.589430 
7.085127 
8.907819 
9.282496 
15.942798 
16.396823 
40.575066 
27.277539 
97.170475 
39.961954 
39.961954 
45.058276 


APPENDIX 


COMPUTER  PROGRAMS 

The  COMP  Programs  were  written  in  FORTRAN  IV  extended  for  implementation 
on  the  CDC  CYBER  74,  utilizing  a  Tektronix  4014  Graphics  Terminal.  Routines 
from  the  International  Mathematical  and  Statistical  Libraries  (IMSL)  and  the 
Directorate  for  Management  Information  Systems  (DMIS)  General  Purpose  Computer 
Subroutines  (ALTLIB)  augment  the  standard  CDC  Library  Routines.  The  programs 
were  designed  to  be  run  in  the  interactive  mode  with  a  minimum  core  requirement 
of  llOOOOg. 

The  first  program,  C0MP1,  is  a  preprocessor.  Using  it,  the  analyst  may 
determine  the  "best"  set  of  conditions  (smoothing,  differencing,  window  size, 
etc.)  for  running  the  full  spectral  analysis  program.  C0MP1  plots  the  original 
series  and  calculates  the  autocorrelations.  Depending  on  the  interactive  com¬ 
mands,  one  or  both  of  the  following  options  may  be  selected, 

(1)  smooth  the  data  using  either  a  running  median  scheme  or  a  median/ 
hamming  technique 

(2)  difference  or  fit  a  curvilinear  regression  model  to  original  or 
smoothed  series. 

The  spectrum  for  the  original  series,  the  differenced  or  the  residuals  from  the 
curvilinear  fit  is  then  calculated  and  plotted. 

A  hamming  window  is  used  for  the  spectrum  calculations.  Three  spectral 
window  sizes  (eg.  8,  16,  32)  are  allowed.  The  curvilinear  regression  routine, 
as  set-up  in  C0MP1,  fits  a  model  of  the  form 

yt  “  *o  +  Bic  +  B2t2  +  B3*t  +  at 

where 


z 


t 


t  <  t* 


t-t*  t  >  t* 


and  t*  is  the  t  value  at  which  a  change  in  the  curve  is  observed. 

This  program  calculates  the  Durbin-Watson  statistic,  the  basis  of  a  test 
of  autocorrelation  in  regression  analysis  (see  Draper  and  Smith)  (4).  In  addi¬ 
tion,  the  value  of  Theil’s  Inequality  Coefficient  (TIC)  along  with  Measures  for 
use  in  determining  if  the  difference  between  the  two  series  is  a  result  of  a 
time  lag,  scale  or  bias  error  are  provided. 
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The  second  and  third  programs,  C0MP2  and  C0MP3,  are  run,  using  the 
experience  gained  from  running  C0MP1,  for  the  full  spectral  analysis.  COMP 3 
makes  the  same  calculations  as  C0MP1  and  calculates  both  the  100 (1-ot)  percent 
individual  and  simultaneous  confidence  intervals  on  the  ratio  of  the  true 
spectra,  the  coherency  function,  the  cross  spectra,  phase  and  gain.  This 
program  is  particularly  useful  when  comparing  series  for  some  causal  rela¬ 
tionship.  If  there  is  no  interest  in  the  interrelationship  between  the  two 
series  one  may  ran  C0MP2.  C0MP2  does  not  output  the  cross  spectral  results. 


Input  to  these  programs  is  from  the  disc  (TAPE1,  TAPE  2,  TAPE  3)  and  the 
console.  TAPE  1  and  TAPE  2  are  data  files.  TAPE  3  contains  the  labels  to  be 
used  with  the  graphics,  the  format  for  reading  the  data  files,  etc.  The 
following  is  an  example  of  the  structure  of  TAPE  3: 


0  0. 

(F14.8) 

LAG 

MAGNITUDE 
PLOT  OF  AUTOCORRELATIONS 
PLOT  OF  DIFF/DETREND  SERIES 
CROSS  SPECTRUM  OP  THE  SERIES 
PLOT  OF  GAIN 


AUTOCORRELATION  FREQUENCY 

SERIES  TIME 

PLOT  OF  ORIGINAL  SERIES 
SPECTRUM  OF  SERIES  1  AND  2 
PHASE  OF  THE  CROSS  SPECTRUM 
PLOT  OF  COHERENCY 


SPECTRUM 

COHERENCY 


RECORD  1 

ISKIP,  XSTAR 


110,  F10.0 


RECORD  2 

DFORM  8A10 

RECORD  3 

AL1,  AL2,  AL3,  AL4  8A10 

RECORD  4 

AL5,  AL6,  AL7,  AL8  8A10 


RECORD  5 

TITL1,  TITL2  6A10 

RECORD  6 

TITL3,  TITL4  6A10 


RECORD  7 

TITL5,  TITL6  6A10 

RECORD  8 

TITL7,  TITL8  6A10 
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LIST  OF  INPUT  VARIABLE  NAMES 


AL1,AL2, 
TITL1, . . 


ISKIP  -  Number  of  RECORDS  to  skip  on  the  data  files 

XSTAR  -  Regression  breakpoint 

DFORM  -  Format  for  reading  the  data  files 

NV  -  Number  of  variables  in  the  data  for  each  series,  NV  is  L.E.2 

ISIZE  -  Number  of  observations  for  each  variable  ISIZE  is  L.E.  250 

X, Y  -  NV  x  250  data  arrays 

MS  -  Number  of  the  variable  to  compare 

ISM  -  Smoothing  indicator,  1  =  YES,  0  =  NO 

ICNT  -  Type  smoothing,  1  *  MEDIAN,  2  =  MED/HAMMING 

K  -  Number  of  autocorrelations  to  compute  or  number  of  lags 
for  FTFREQ 

ID  -  Difference/spectra  plot/analysis  complete  indicator 
. ..,AL8  -  Labels  for  ABSCISSA  and  ordinate  of  plots 
.  ,TITL8  -  Titles  for  plots 

Output  is  both  printed  format  on  file  output  and  special  graphics. 
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